Abstract

We highlight a very simple statistical tool for the analysis of financial bubbles, which has already been studied
in [1]. We provide extensive empirical tests of this statistical tool and investigate analytically its link with the
correlation structure of stocks.

Introduction 

Forecasting the burst of financial bubbles would be extremely useful for many players in stock exchanges, including
regulators, portfolio managers and investment banks. Fundamental indicators relying on economic analysis can
be monitored. But is it possible to find some statistical regularity in market crashes? Several authors, including
[1, 3, 4], have already attempted to answer this question. In this paper, we focus on a very simple statistical tool,
first introduced in [1], and study analytically its link with the correlation structure of stocks. This approach is similar to
the one studied in [3], although it differs in the statistical object under consideration.

1 A spatial survival function 

In [1], an unusual and interesting statistical tool is introduced in order to study market crashes. Given $N$ stocks
on a market place, a reference date $t_{ref}$ and the current date $t$, we set

$$S_N(z, t_{ref}, t) := \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\{X_i(t_{ref}, t) > z\}},$$

where $X_i(t_{ref}, t) := \frac{P_i(t)}{P_i(t_{ref})}$ is the performance of stock $i$ over $[t_{ref}, t]$, i.e. its price $P_i$ at date $t$ normalized by its price at $t_{ref}$. In the following, we shall drop the time
arguments for notational simplicity. Intuitively, $S_N(z)$ can be seen as the proportion of stocks displaying a greater
performance than $z$, i.e. the survival function of stocks on day $t$. From this point of view, it is a measure of
stock dispersion: a slowly decreasing $S_N(z)$ indicates broadly distributed performances, and thus a large
dispersion. Since $t_{ref}$ is supposed to be as close as possible to the onset of the bubble, $X_i(t_{ref}, t)$ might be of the
order of monthly or yearly returns. Short notes on similar statistical objects can be found in [51 [5]. This is very
different from looking at daily returns as in [H H] and might be more relevant for bubble detection, since it often
takes a long time for a bubble to build up and for bubbling stocks to disperse.
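The quantity described above, the proportion of stocks whose performance exceeds a threshold z, can be sketched in a few lines. This is only an illustration: it assumes the performance is the price at date t divided by the price at the reference date, and the function names and toy prices are hypothetical, not taken from the paper.

```python
import numpy as np

def performances(prices_ref, prices_now):
    """Normalized prices X_i = P_i(t) / P_i(t_ref), one value per stock (assumed definition)."""
    prices_ref = np.asarray(prices_ref, dtype=float)
    prices_now = np.asarray(prices_now, dtype=float)
    return prices_now / prices_ref

def survival_function(x, z):
    """Empirical S_N(z): fraction of stocks whose performance is strictly above z."""
    x = np.asarray(x, dtype=float)
    z = np.atleast_1d(np.asarray(z, dtype=float))
    # Compare every threshold with every stock, then average over stocks.
    return (x[None, :] > z[:, None]).mean(axis=1)

# Toy data: hypothetical prices for N = 5 stocks at t_ref and at t.
p_ref = [10.0, 20.0, 5.0, 8.0, 50.0]
p_now = [15.0, 18.0, 20.0, 8.0, 55.0]
x = performances(p_ref, p_now)
s = survival_function(x, [0.5, 1.0, 2.0, 5.0])
print(s)
```

Plotting `s` against the thresholds in log-log scale is one way to inspect the tail behaviour of the cross-sectional distribution.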



The statistical properties of $S_N(z)$ are remarkably robust. The main features of this quantity are:

• for $z \to +\infty$, $S_N(z) \sim z^{-\alpha}$;

• the variance of the $X_i$'s increases dramatically before crashes.

These two features are robust with respect to the choice of the arbitrary reference date $t_{ref}$ and hold over
several financial crashes. We illustrate these two facts in Figures 1 and 2. We use daily close prices of three different
sets of stocks: the stocks composing the Australian Stock Exchange All Ordinaries index (AORD, 500 stocks); the
stocks composing the New York Stock Exchange Composite index (NYA, 1800 stocks); the stocks composing the
Shanghai Composite index (SSE, 900 stocks). We choose the first trading day of 2003 as our date $t_{ref}$. As for the
first fact, Figure 1 shows three examples of distributions of the $X_i$'s at random dates. It appears that the power-law
tail is indeed a good fit as the normalized prices grow. As for the second fact, we plot in Figure 2 the time series of
the average and variance of the $X_i$'s. It appears clearly that, for all three sets of stocks, the variance grows dramatically as the
bubble inflates, especially prior to crashes.

Figure 1: Survival functions of cross-sectional distributions of normalized prices on three sets of stocks at random
dates, in log-log scale. (left) AORD stocks on Nov 24th, 2008; (center) NYA stocks on Dec 12th, 2005; (right) SSE

Figure 2: Evolution of the mean ("average price") and variance of the $X_i$'s on three sets of stocks from Jan 1st,
2003 to Sep 3rd, 2009. (left) AORD stocks; (center) NYA stocks; (right) SSE stocks.
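The time series of the cross-sectional mean and variance discussed above can be sketched as follows. This is a minimal illustration under stated assumptions: the price panel is a (days x stocks) array, the reference date is given as a row index, and the data here are synthetic random walks rather than the AORD, NYA or SSE prices used in the paper. The function name `dispersion_series` is hypothetical.

```python
import numpy as np

def dispersion_series(prices, ref_index=0):
    """Cross-sectional mean and variance of normalized prices for each day.

    prices: array of shape (T, N), one row per day, one column per stock.
    ref_index: row used as the reference date t_ref (assumption: an index, not a calendar date).
    """
    prices = np.asarray(prices, dtype=float)
    x = prices[ref_index:] / prices[ref_index]   # X_i(t_ref, t) for every t >= t_ref
    # Variance with the 1/N normalization, taken over stocks (axis=1).
    return x.mean(axis=1), x.var(axis=1)

# Synthetic panel: 250 days, 50 stocks following independent geometric random walks.
rng = np.random.default_rng(0)
log_returns = rng.normal(0.0, 0.02, size=(250, 50))
prices = 100.0 * np.exp(np.cumsum(log_returns, axis=0))

mean_x, var_x = dispersion_series(prices, ref_index=0)
# Changing the reference date only changes the row used for normalization.
mean_late, var_late = dispersion_series(prices, ref_index=100)
```

On real data, one would call `dispersion_series` once per candidate reference date and compare the resulting variance curves.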

We also observe that these properties are robust with respect to the choice of $t_{ref}$: in Figure 3 (left), time series
of the variance of the $X_i$'s are plotted for eleven different reference dates (first trading day of each year from 1998 to
2008). The bursts of variance exist whatever the date $t_{ref}$. For the series that begin before 2000,
the Internet bubble is clearly visible. The recent 2007 bubble is also visible for all series, even for the most recent
reference dates (see zoom in inset).

In [1], the author argues that the power-law exponent $\alpha$ dives towards 1 before crashes, leading to the divergence
of the mean performance and thus to the burst of the financial bubble. We find that this threshold is not robust
with respect to $t_{ref}$. In Figure 3 (right), the Hill estimator of the Pareto exponent of the tail of the distribution of
the $X_i$'s is plotted, and the value of the index clearly depends on the reference date $t_{ref}$. However, the decrease in $\alpha$
and the rise of the variance are linked, since the variance of a random variable distributed as a power law of index
$\alpha > 2$ (with unit scale) is given by $\frac{\alpha}{(\alpha-1)^2(\alpha-2)}$. In Figure 3 (right), we observe that the Pareto exponent shows local minima and sharp
peaks prior to crashes. For example, the first minimum is obtained on March 6th, 2000 for all (started) time series, and
the maximum of the Internet bubble is observed on March 9th, 2000. More recently, all time series exhibit a sharp
minimum on June 30th-July 1st, 2008, prior to the large market dive of September 2008 (the market peak in 2008 is
attained on June 6th).
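For readers unfamiliar with the Hill estimator mentioned above, here is a minimal sketch. It uses the standard form of the estimator based on the k largest observations; the choice of k, the function names and the synthetic Pareto sample are assumptions made for illustration, since the paper does not state how the tail cutoff was chosen.

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimate of the tail index alpha from the k largest values of x."""
    x = np.sort(np.asarray(x, dtype=float))[::-1]   # descending order
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < len(x)")
    if x[k] <= 0:
        raise ValueError("the (k+1)-th largest value must be positive")
    # Average log-excess of the k largest values over the (k+1)-th largest.
    log_excess = np.log(x[:k] / x[k])
    return k / log_excess.sum()

def pareto_variance(alpha):
    """Variance of a unit-scale Pareto variable; finite only for alpha > 2."""
    if alpha <= 2:
        raise ValueError("variance is infinite for alpha <= 2")
    return alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))

# Synthetic check: classical Pareto sample with alpha = 3 and unit scale.
rng = np.random.default_rng(0)
sample = rng.pareto(3.0, size=20000) + 1.0
alpha_hat = hill_estimator(sample, k=2000)
print(alpha_hat, pareto_variance(3.0), pareto_variance(2.5))
```

The last line illustrates the link stated in the text: as the exponent decreases towards 2, the variance of the corresponding power law grows without bound.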



2 Link with the covariance structure 

On the one hand, financial bubbles happen when market quotes of certain sectors of the economy are booming,
thus increasing the dispersion across all stocks. On the other hand, it is common knowledge that market crashes
are associated with bursts of volatility and correlation. How can we relate these two phenomena? Empirically, the
rise in dispersion is given by the increase in the variance of performances $V_N$, which can be computed as follows:

$$V_N := \int_{-\infty}^{+\infty} z^2 s_N(z)\,dz - \left( \int_{-\infty}^{+\infty} z\, s_N(z)\,dz \right)^2,$$

where $s_N(z) = -\frac{d}{dz} S_N(z) = \frac{1}{N} \sum_{i=1}^{N} \delta(X_i - z)$ is the empirical probability density function associated with performances. It is easily seen that

$$V_N = \frac{1}{N} \sum_{i=1}^{N} X_i^2 - \bar{X}_N^2,$$

with $\bar{X}_N := \frac{1}{N} \sum_{i=1}^{N} X_i$.

Figure 3: (left) Variance of the normalized prices $X_i$ for eleven different reference dates $t_{ref}$. (right) Hill estimator
of the Pareto exponent of the tail of the distribution of the normalized prices for eleven different reference dates
$t_{ref}$. (all) These graphs are computed using daily close prices of the stocks composing the US Russell 3000 index
(3000 stocks).

Since $V_N$ is a sum of random variables, we would like to apply a convergence theorem
such as the law of large numbers. However, we must be cautious since the $X_i$'s are correlated and not identically
distributed. $V_N$ will converge towards its mean if its variance is asymptotically nil. We set $m_i := E(X_i)$, $\sigma_i^2 :=
Var(X_i)$, $\rho_{ij} := Corr(X_i, X_j)$. Straightforward computations give $E(V_N)$ and $Var(V_N)$; in particular,

$$E(V_N) = \frac{1}{N} \sum_{i=1}^{N} \left( \sigma_i^2 + m_i^2 \right) - \frac{1}{N^2} \sum_{i,j=1}^{N} \left( \rho_{ij} \sigma_i \sigma_j + m_i m_j \right).$$

These two quantities exist provided that $c_{ij}^{\alpha\beta} := E(X_i^{\alpha} X_j^{\beta}) < \infty$ for all $(i, j)$ and $(\alpha, \beta) \in \mathcal{E} := \{(\gamma, \delta) \in \{0,1,2,3,4\}^2 :
\gamma + \delta \le 4\}$. If the sequences $(c_{ij}^{\alpha\beta})$, $i,j = 1,\dots,N$, $(\alpha, \beta) \in \mathcal{E}$, are such that $\lim_{N \to +\infty} Var(V_N) = 0$, then

$$\lim_{N \to +\infty} V_N = \lim_{N \to +\infty} \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \sigma_i^2 + m_i^2 \right) - \frac{1}{N^2} \sum_{i,j=1}^{N} \left( \rho_{ij} \sigma_i \sigma_j + m_i m_j \right) \right].$$

The above equation relates the dispersion of stocks $V_N$ to their variance-covariance (and mean) structure. The
dispersion $V_N$:

• increases as individual variances increase: if stocks are individually very unstable, then it is very likely that 
the whole market will be so; 

• increases as covariances, in particular as correlations, decrease: the more anti-correlated stocks are, the more 
distant each pair will be, so that the whole dispersion increases. 
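Both effects listed above can be checked numerically. Taking $V_N$ to be the cross-sectional variance of the $X_i$'s (mean of squares minus squared mean), its expectation follows from $E(X_i X_j) = \rho_{ij}\sigma_i\sigma_j + m_i m_j$. The sketch below evaluates that expectation for an equicorrelated structure; the function names and the equicorrelation choice are illustrative assumptions, not the authors' code.

```python
import numpy as np

def expected_dispersion(means, stds, corr):
    """E(V_N) for V_N = (1/N) sum X_i^2 - (mean of X_i)^2.

    corr must be an N x N correlation matrix with ones on the diagonal.
    """
    m = np.asarray(means, dtype=float)
    s = np.asarray(stds, dtype=float)
    rho = np.asarray(corr, dtype=float)
    n = m.size
    # Matrix of second moments E(X_i X_j) = rho_ij * s_i * s_j + m_i * m_j.
    second_moments = rho * np.outer(s, s) + np.outer(m, m)
    return np.trace(second_moments) / n - second_moments.sum() / n ** 2

def equicorrelation(n, rho):
    """Correlation matrix with ones on the diagonal and rho elsewhere."""
    return np.full((n, n), rho) + (1.0 - rho) * np.eye(n)

n = 1000
zeros, ones = np.zeros(n), np.ones(n)
for rho in (0.0, 0.2, 0.8):
    # Dispersion shrinks as the common correlation grows.
    print(rho, expected_dispersion(zeros, ones, equicorrelation(n, rho)))
for sigma in (0.5, 1.0, 2.0):
    # Dispersion grows with the individual standard deviations.
    print(sigma, expected_dispersion(zeros, sigma * ones, equicorrelation(n, 0.5)))
```

With zero means and unit variances, the expression reduces to $(1 - \rho)(1 - 1/N)$ for a common correlation $\rho$, which makes the dependence on correlation explicit.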



Both effects are illustrated in Table 1. We simulate a zero-mean Gaussian vector in dimension $N = 1000$
$M = 100$ times, then compute the mean of $V_N$ over these $M$ simulations for different levels of correlation or
standard deviation.

Assume that the initial $X_i$'s are centered and normalized so that $m_i = 0$ and $\sigma_i = 1$. We are then able to give
bounds on $\lim V_N$ depending on correlation. Setting $\rho_{ij} = 1$ for all $i,j$ leads to $E(V_N) = 0$, and $\rho_{ij} = -1$ for all $i,j$
leads to $E(V_N) = 2$, as shown in Table 1.
(left) $\sigma = 1$

$\rho$      Mean $V_N$
-1          1.997
-0.8        1.808
-0.6        1.595
-0.4        1.393
-0.2        1.199
0           0.999
0.2         0.797
0.4         0.603
0.6         0.400
0.8         0.199
1           $7.430 \times 10^{-15}$

(right) $\rho = 0.5$

$\sigma$    Mean $V_N$ ($\times 10^{-1}$)
0.1         0.056
0.3         0.377
0.5         1.654
0.7         3.052
0.9         3.866
1.1         5.895
1.3         8.765
1.5         14.249
1.7         13.104
1.9         22.164

Table 1: Simulation of a Gaussian vector with different levels of correlation and standard deviation. As correlation
decreases, the dispersion is larger. Standard deviations (resp. correlations) are set to 1 (resp. 0.5) in the left (resp.
right) table.
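A simulation in the spirit of Table 1 can be sketched as follows. The paper does not say how the correlated Gaussian vector was generated, so the one-factor construction used here is an assumption; it only covers non-negative common correlations, and the function name `mean_dispersion` and the seed are hypothetical.

```python
import numpy as np

def mean_dispersion(rho, sigma=1.0, n=1000, m=100, seed=0):
    """Average of V_N over m draws of an equicorrelated zero-mean Gaussian vector.

    Assumed construction (one-factor model, valid for 0 <= rho <= 1):
        X_i = sigma * (sqrt(rho) * Z + sqrt(1 - rho) * eps_i),
    with Z and eps_i independent standard normal variables.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this construction requires 0 <= rho <= 1")
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((m, 1))        # one common factor per simulation
    idiosyncratic = rng.standard_normal((m, n)) # one shock per stock and simulation
    x = sigma * (np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idiosyncratic)
    # V_N is the cross-sectional variance (1/N normalization) of each simulated vector.
    return x.var(axis=1).mean()

for rho in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(rho, mean_dispersion(rho))
```

Under this construction the averages should stay close to $(1 - \rho)(1 - 1/N)$ for unit standard deviations, in line with the left part of the table for $\rho \ge 0$; negative correlations would need a different construction.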



3 Conclusion and further research 

We have studied empirically and analytically an indicator of the build-up and burst of bubbles on financial markets. This
statistical tool is the variance, over the stock universe, of performances measured from a fixed starting date. It is quite robust with
respect to the choice of the market place and the starting date. The dot-com and subprime bubbles are well identified
by this methodology. Fundamentally, we establish a link between the build-up of a bubble and anti-correlation
between stocks, as well as bursts of individual variances.

Regarding further research, we have two things in mind:

• test empirically the link we establish between bubbles and stocks variance-covariance structure; 

• suggest an agent-based model to explain the fundamental microscopic mechanisms underlying this link between 
bubbles and stocks variance-covariance structure. 

The first direction requires a statistical measure of the variance and correlation of the $X_i(t_{ref}, t)$ over $i$ at each time
$t$. This problem is quite complex in practice since, unless one goes 30 years back, one does not have enough historical daily
data to compute them. A solution might be found in the use of high-frequency data, by computing the required variances and
correlations over the returns of the previous day. Furthermore, having access to these statistical quantities
would be useful for normalizing returns in order to bound the market variance $V_N$, thus making it an indicator with
fundamental thresholds.

Some exchange models for wealth distribution, such as [7], have striking similarities with our approach and could
therefore be used to understand, from a microscopic point of view based on interactions between individual stocks,
the power-law distribution of the stock ensemble and the burst of global variance during bubbles.